Arkaprabho Ghosh

Senior Solutions Architect, Generative AI

Cisco IT

Arkaprabho Ghosh is a Senior Solutions Architect and Generative AI Professional at Cisco, where he conceptualizes and architects enterprise-grade AI platforms that empower thousands of internal users to rapidly build and deploy intelligent applications. He is the lead architect behind Cisco's DRIFT (https://drift-ai.cisco.com/), a production-grade, modular framework that enables AI developers to create API-driven data pipelines with pluggable components including custom data loaders, embedding models, retrieval strategies, and reranker models. Ghosh specializes in Retrieval-Augmented Generation (RAG), LLM prompt engineering, and agentic AI architectures. Prior to his current role, Ghosh served as a Data Scientist at Cisco, where he delivered a series of high-impact AI solutions, including a Generative AI-based Recommendation Assistant that reduced engineering support effort, a product recommender system for SMB customers, and customer churn prediction models identifying high-risk customers. He also led the creation of a whitepaper on Causal Inference-based analytics. With over two decades of experience spanning Data Engineering, Machine Learning, and Solution Architecture, including leadership roles at Infosys supporting Cisco and Apple, Ghosh brings deep expertise in building data platforms, feature stores, and automation frameworks that deliver measurable business outcomes. He holds a Master of Science in Statistics from IIT Kanpur and is the recipient of the Cisco FY20 Data Science Award for Top Line Growth. He is passionate about translating advanced AI research into scalable, production-ready platforms that accelerate enterprise AI adoption.

Fine-Tuning Embedding Models for Enterprise Retrieval: A Practical Guide with NVIDIA Nemotron Recipe

Cisco IT recently evaluated fine-tuning embedding models using the NVIDIA Nemotron RAG fine-tuning recipe as part of an effort to improve retrieval accuracy for domain-specific enterprise data. The objective was not to redesign existing retrieval-augmented generation (RAG) systems, but to understand whether targeted embedding fine-tuning could materially improve semantic search quality with reasonable effort and fast turnaround. Through this experiment, Cisco validated firsthand that embedding fine-tuning, combined with synthetic data generation, can deliver measurable accuracy gains within a short time frame. The experiment also demonstrated strong time-to-value, enabling rapid iteration and clear performance signals without long training cycles or extensive manual labeling. A key outcome of this collaboration was the short turnaround: only a few days were needed to understand the immediate benefits. The embedding model training and evaluation workflow was executed on Cisco AI PODs running Cisco UCS 885A infrastructure powered by the NVIDIA HGX platform.